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Abstract 

Crash-only programs crash safely and recover quickly. 
There is only one way to stop such software—by crashing 
it—and only one way to bring it up—by initiating recov¬ 
ery. Crash-only systems are built from crash-only compo¬ 
nents, and the use of transparent component-level retries 
hides intra-system component crashes from end users. In 
this paper we advocate a crash-only design for Internet sys¬ 
tems, showing that it can lead to more reliable, predictable 
code and faster, more effective recovery. We present ideas 
on how to build such crash-only Internet services, taking 
successful techniques to their logical extreme. 


1. Occam’s Razor and the Restart Potpourri 

There are many reasons to restart software, and many 
ways to do it. Studies have shown that a main source of 
downtime in large scale software systems is caused by in¬ 
termittent or transient bugs [12, 20, 19, 1], Most non- 
embedded systems have a variety of ways to stop; for ex¬ 
ample, an operating system can shut down cleanly, panic, 
hang, crash, lose power, etc. 

When shutting down programs cleanly, unavailability 
consists of the time to shut down and the time to come back 
up; when crash-rebooting, unavailability consists only of the 
time to recover. Ironically, shutting down and reinitializing 
can sometimes take longer than recovering from a crash. Ta¬ 
ble 1 illustrates a casual comparison of reboot times; no im¬ 
portant data was lost in either of the experiments. 


System 

Clean reboot 

Crash reboot 

RedHat 8 (with ext3fs) 

104 sec 

75 sec 

JBoss 3.0 application server 

47 sec 

39 sec 

Windows XP 

61 sec 

48 sec 


Table 1. Duration of clean vs. crash reboots. 


It is impractical to build a system that is guaranteed to 
never crash, even in the case of carrier class phone switches 
or high end mainframe systems. Since crashes are unavoid¬ 
able, software must be at least as well prepared for a crash 
as it is for a clean shutdown. But then—in the spirit of Oc¬ 
cam’s Razor—if software is crash-safe, why support addi¬ 
tional, non-crash mechanisms for shutting down? A fre¬ 
quent reason is the desire for higher performance. 


For example, to avoid slow synchronous disk writes, 
many UNIX file systems cache metadata updates in mem¬ 
ory. As a result, when a UNIX workstation crashes, the 
file system reaches an inconsistent state that takes a lengthy 
fsck to repair, an inconvenience that could have been 
avoided by shutting down cleanly. This captures the design 
tradeoff that improves steady state performance at the ex¬ 
pense of shutdown and recovery performance. In the face 
of inevitable crashes, such a file system turns out to be brit¬ 
tle: a crash can lose data and, in some cases, the post-crash 
inconsistency cannot even be repaired. Not only do such 
performance tradeoffs impact robustness, but they also lead 
to complexity by introducing multiple ways to manipulate 
state, more code, and more APIs. The code becomes harder 
to maintain and offers the potential for more bugs—a fine 
tradeoff, if the goal is to build fast systems, but a bad idea if 
the goal is to build highly available systems. If the cost of 
such performance enhancements is dependability, perhaps 
it’s time to reevaluate our design strategy. 

In earlier work, we used recursive micro-reboots to im¬ 
prove the availability of a soft-state system that was trivially 
crash-safe [3]. In this paper we advocate a crash-only design 
(i.e., crash safety + fast recovery) for Internet systems, a 
class distinguished by the following properties: large scale, 
stringent high availability requirements, built from many 
heterogenous components, accessed over standard request- 
reply protocols such as HTTP, serving workloads that con¬ 
sist of large numbers of relatively short tasks that frame state 
updates, and subjected to rapid and perpetual evolution. We 
restrict our attention to single installations that reside inside 
one data center and do not span administrative domains. 

In high level terms, a crash-only system is defined by the 
equations stop=crash and start=recover. In the rest of the 
paper, we describe the benefits of the crash-only design ap¬ 
proach by analogy to physics, describe the internal proper¬ 
ties of components in a crash-only system, the architectural 
properties governing the interaction of components, and a 
restart/retry architecture that exploits crash-only design, in¬ 
cluding our work to date on a prototype using J2EE. 

2. Why Crash-Only Design ? 

Mature engineering disciplines rely on macroscopic de¬ 
scriptive physical laws to build and understand the behavior 
of physical systems. These sets of laws, such as Newto¬ 
nian mechanics, capture in simple form an observed physi¬ 
cal invariant. Software, however, is an abstraction with no 
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physical embodiment, so it obeys no physical laws. Com¬ 
puter scientists have tried to use prescriptive rules, such as 
formal models and invariant proofs, to reason about soft¬ 
ware. These rules, however, are often formulated relative to 
an abstract model of the software that does not completely 
describe the behavior of the running system (which includes 
hardware, an operating system, runtime libraries, etc.). As a 
result, the prescriptive models do not provide a complete de¬ 
scription of how the implementation behaves in practice, be¬ 
cause many physically possible states of the complete sys¬ 
tem do not correspond to any state in the abstract model. 

With the crash-only property, we are trying to impose, 
from outside the software system, macroscopic behavior 
that coerces the system into a simpler, more predictable uni¬ 
verse with fewer states and simpler invariants. Each crash- 
only component has a single idempotent “power-off switch” 
and a single idempotent “power-on switch”; the switches for 
larger systems are built by wiring together their subsystems’ 
switches in ways described by section 3. A component’s 
power-off switch implementation is entirely external to the 
component, thus not invoking any of the component’s code 
and not relying on correct internal behavior of the compo¬ 
nent. Examples of such switches include kill -9 sent to 
a UNIX process, or turning off the physical, or virtual, ma¬ 
chine that is running some software inside it. 

Keeping the power-off switch mechanism external to 
components makes it a high confidence “component 
crasher.” Consequently, every component in the system 
must be prepared to suddenly be deactivated. Power-off and 
power-on switches provide a very small repertoire of high- 
confidence, simple behaviors, leading to a small state space. 
Of course, the “virtual shutdown” of a virtual machine, even 
if invoked with kill -9, has a much larger state space 
than the physical power switch on the workstation, but it 
is still vastly simpler than the state space of a typical pro¬ 
gram hosted in the VM, and it does not vary for different 
hosted programs. Indeed, the fact that virtual machines are 
relatively small and simple compared to the programs they 
host has been successfully invoked as an argument for using 
VMs for inter-application isolation [26]. 

Recovery code deals with exceptional situations, and 
must run flawlessly. Unfortunately, exceptional situations 
are difficult to handle, occur seldom, and are not trivial to 
simulate during development; this often leads to unreliable 
recovery code. In crash-only systems, however, recovery 
code is exercised every time the system starts up, which 
should ultimately improve its reliability. This is particularly 
relevant given that the rate at which we reduce the number 
of bugs per Klines of code lags behind the rate at which 
the number of Klines per system increases, with the net re¬ 
sult being that the number of bugs in an evolving system 
increases with time [7]. More bugs mean more failures, and 
systems that fail more often will need to recover more often. 

Many of the benefits resulting from a crash-only design 
have been previously obtained in the data storage/retrieval 
world with the introduction of transactions. Our approach 


aims for a similar effect on the failure properties of Internet 
systems—crash-only design is in many ways a generaliza¬ 
tion of the transaction model. It is important to note that In¬ 
ternet applications do not have to use transactions in order 
to be crash-only; in fact, ACID semantics can sometimes 
lead to overkill. For example, session data accumulates 
information at the server over a series of user service re¬ 
quests, for use in subsequent operations. It is mostly single- 
reader/single-writer, thus not requiring ordering and concur¬ 
rency control. The richness of a query language like SQL is 
unnecessary, and session state usually does not persist be¬ 
yond a few minutes. These observations are leveraged by 
SSM [18], a crash-only hashtable-like session state store. 

A crash-only system makes it affordable to transform 
every detected failure into component-level crashes; this 
leads to a simple fault model, and components only need 
to know how to recover from one type of failure. For ex¬ 
ample, [21] forced all unknown faults into node crashes, 
allowing the authors to improve the availability of a clus- 
terized web server. Existing literature often assumes unreal¬ 
istic fault models (e.g., that failures occur according to well- 
behaved tractable distributions); a crash-only design enables 
aggressive enforcement of such desirable fault models, thus 
increasing the impact of prior work. If we state invariants 
about the system’s failure behavior and make such behavior 
predictable, we are effectively coercing reality into a small 
universe governed by well-understood laws. 

Moreover, a system in which crash-recovery is cheap 
allows us to micro-reboot suspect components before 
they fail. By aggressively employing software rejuvena¬ 
tion [16]—rebooting in order to stave off failure induced 
by resource exhaustion—we can prevent failures altogether. 
The system can immediately trigger component-level re¬ 
juvenation whenever it notices fail-stutter behavior [2], a 
trough in offered workload, or based on predictive mathe¬ 
matical models of software aging [10]. 

Finally, if we admit that most failures can be recovered 
by micro-rebooting, crashing every suspicious component 
could shorten the fault detection and diagnosis time—a pe¬ 
riod that sometimes lasts longer than repair itself. 

3. Properties of Crash-Only Software 

In this section we describe a set of properties that we 
deem sufficient for a system to be crash-only. In some sys¬ 
tems, some of these properties may not be necessary for 
crash-only behavior. 

To make components crash-only, we require that all im¬ 
portant non-volatile state be kept in dedicated state stores, 
that state stores provide applications with the right abstrac¬ 
tions, and that state stores be crash-only. To make a sys¬ 
tem of interconnected components crash-only, it must be de¬ 
signed so that components can tolerate the crashes and tem¬ 
porary unavailability of their peers. This means we require 
strong modularity with relatively impermeable component 
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boundaries, timeout-based communication and lease-based 
resource allocation, and self-describing requests that carry 
a time-to-live and information on whether they are idem- 
potent. Many Internet systems today have some subset of 
these properties, but we do not know of any that combines 
all properties into a true crash-only system. 

In section 4 we will show how crash-only components 
can be glued together into a robust Internet system based on 
a restart/retry architecture; in the rest of this section we de¬ 
scribe in more detail the properties of crash-only systems. 
Some relate to intra-component state management, while 
others relate to inter-component interactions. We recognize 
that some of these sacrifice performance, but we strongly 
believe the time has come for robustness to reclaim its sta¬ 
tus as a first-class citizen. 

3.1. Intra-Component Properties 

In today’s Internet applications there are a small num¬ 
ber of types of state: transactional persistent state, single- 
reader/single-writer persistent state (e.g., user profiles, that 
almost never see concurrent access), expendable persistent 
state (server-side information that could be sacrificed for the 
sake of correctness or performance, such as clickstream data 
and access logs), session state (e.g., the result set of a previ¬ 
ous search, subject to refinement), soft state (state that can 
be reconstructed at any time based on other data sources), 
and volatile state. While differentiated mostly by guaran¬ 
teed lifetime, the requirements for these categories of state 
lead to qualitatively different implementations. 

All important non-volatile state is managed by dedica- 
ted state stores , leaving applications with just program 
logic. Specialized state stores (e.g., relational and object- 
oriented databases, file system appliances, distributed data 
structures [14], non-transactional hashtables [15], session 
state stores [18], etc.) are much better suited to manage 
state than code written by developers with minimal training 
in systems programming. Applications become stateless 
clients of the state stores, which allows them to have 
simpler and faster recovery routines. A popular example 
of such separation can be found in three-tier Internet 
architectures, where the middle tier is largely stateless and 
relies on backend databases to store data. 

These state stores must also be crash-only, otherwise the 
problem has just moved down one level. Many commercial 
off-the-shelf state stores available today are crash-safe (i.e., 
they can be crashed without loss of data), such as databases 
and the various network-attached storage devices, but most 
are not crash-only, because they recover slowly. Many prod¬ 
ucts, however, offer tuning knobs that permit the administra¬ 
tor to trade performance for improved recovery time, such as 
making checkpoints more often in the Oracle DBMS [17]. 
An example of a pure crash-only state store is the Postgres 
database system [25], which uses non-overwriting storage 
and maintains all data in a single append-only log. Although 


it trades away some performance, Postgres achieves practi¬ 
cally instantaneous recovery, because it only needs to mark 
the uncommitted transactions as aborted. 

The abstractions and guarantees provided by state stores 
must be congruent with application requirements. This 
means that the state abstraction exported by the state store is 
not too powerful (e.g., offering a SQL interface with ACID 
semantics for storing and retrieving simple key-value tu¬ 
ples) and not too weak. A state abstraction that is too weak 
will require client components to do some amount of state 
management, such as implementing a customer record ab¬ 
straction over an offered file system interface. Good state 
abstractions allow applications to operate at their “natural” 
semantic level. Offering the weakest state guarantees that 
satisfy the application allows us to exploit application se¬ 
mantics and build simpler, faster, more reliable state stores. 

For example, Berkeley DB [22] is a storage system sup¬ 
porting B+tree, hash, and record abstractions. It can be ac¬ 
cessed through four different interfaces, ranging from no 
concurrency control/no transactions/no disaster recovery to 
a multi-user, transactional API with logging, fine-grained 
locking, and support for data replication. Applications can 
use the abstraction that is right for their purposes and the 
underlying state store optimizes its operation to fit those re¬ 
quirements. Workload characteristics can also be leveraged 
by state stores; e.g., expecting a read-mostly workload al¬ 
lows a state store to utilize write-through caching, which 
can significantly improve recovery time and performance. 

We do not advocate that every application have its own 
set of state stores. Instead, we believe Internet systems will 
standardize on a small number of state store types: ACID 
stores (e.g., databases for customer and transaction data), 
non-transactional persistent stores (e.g., DeStor [15], a 
crash-only system specialized in handling non-transactional 
persistent data, like user profiles), session state stores (e.g., 
SSM [18] for shopping carts), simple read-only stores (e.g., 
hie system appliances for static HTML and images), and 
soft state stores (e.g., web caches). If we think carefully 
about the state abstractions required by each application 
component and use suitable state stores, we can make these 
components crash-only. 

3.2. Inter-Component Properties 

Subsystems that crash on their own, or that are explic¬ 
itly crash-rebooted for recovery, will temporarily become 
unavailable to serve requests. For a crash-only system to 
gracefully tolerate such behavior, we need to decouple com¬ 
ponents from each other, from the resources they use, and 
from the requests they process. 

Components have externally enforced boundaries that 
provide strong fault containment. The desired isolation can 
be achieved with virtual machines, isolation kernels [26], 
task-based intra-JVM isolation [24, 8], OS processes, etc. 
Indeed, the Denali isolation kernel is designed for such en- 
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capsulation. Web hosting service providers often use mul¬ 
tiple virtual machines on one physical machine to offer 
their clients individual web servers they can administer at 
will, without affecting other customers. The boundaries be¬ 
tween components delineate distinct, individually recover¬ 
able stages in the processing of requests. 

All interactions between components have a timeout. 
This includes explicit communication as well as RPC: if no 
response is received to a call within the allotted timeframe, 
the caller assumes the callee has failed and reports it to 
a recovery agent [5], which can crash-restart the callee 
if appropriate. Crash-restarting helps ensure the called 
component is in a known state; this is safe because the 
component is crash-safe and crash-restart is idempotent. 
Timeouts provide an orthogonal mechanism for turning 
all non-Byzantine failures, both at the component level 
and at the network level, into fail-stop events (i.e., the 
failed entity either provides results or is stopped), even 
though the components are not necessarily fail-stop. Such 
behavior is easier to accomodate, and containment of faults 
is improved. 

Crash-only components recover quickly, thus recovery 
is very cheap. Under such circumstances, it becomes ac¬ 
ceptable for the recovery manager to crash-recover suspect 
components even when it lacks the certainty that those com¬ 
ponents have indeed failed; the downtime risk of letting the 
components run may be higher than crash-rebooting healthy 
components. 

All resources are leased , rather than permanently allo¬ 
cated, to ensure that resources are not coupled to the compo¬ 
nents using them. Resources include many types of persis¬ 
tent state, such as account profiles for a free e-mail provider: 
every time the user logs in, a 6-month lease is renewed; 
when the lease expires, all associated data can be purged 
from the system. It also includes CPU resources: if a com¬ 
putation is unable to renew its execution lease, it is termi¬ 
nated by a high confidence watchdog [9]. For example, in 
PHP, a server-side scripting language used for writing dy¬ 
namic web pages, runaway scripts are killed and an error is 
returned to the web browser. Leases [11] give us the ability 
to reason about the conditions that hold true of the system’s 
resources after a lease expires. Infinite timeouts or leases 
are not acceptable; the maximum-alio wed timeout and lease 
are specified in an application-global policy. This way it is 
less likely that the system will hang or become blocked. 

Requests are entirely self-describing, by making the 
state and context needed for their processing explicit. This 
allows a fresh instance of a rebooted component to pick up 
a request and continue from where the previous instance 
left off. Requests also carry information on whether they 
are idempotent, along with a time-to-live; both idempo- 
tency and TTL information can initially be set at the system 
boundary, such as in the web tier. For example, the TTL 
may be determined by load or service level agreements, and 
idempotency flags can be based on application-specific in¬ 


formation (which can be derived, for instance, from URL 
substrings that determine the type of request). Many in¬ 
teresting operations in an Internet service are idempotent, 
or can easily be made idempotent by keeping track of se¬ 
quence numbers or by wrapping requests in transactions; 
some large Internet services have already found it practi¬ 
cal to do so [23]. Over the course of its lifetime, a request 
will split into multiple sub-operations, which may rejoin, 
in much the same way nested transactions do. Recovering 
from a failed idempotent sub-operation entails simply reis¬ 
suing it; for non-idempotent operations, the system can ei¬ 
ther roll them back, apply compensating operations, or tol¬ 
erate the inconsistency resulting from a retry. Such trans¬ 
parent recovery of the request stream can hide intra-system 
component failures from the end user. 

4. A Restart/Retry Architecture 


A component infers failure of a peer component either 
based on a raised exception or a timeout. When a compo¬ 
nent is reported failed, a recovery agent may crash-reboot 
it; the idempotency of crash-shutdown makes this an inex¬ 
pensive way to ensure the component is indeed turned off 
before attempting recovery. Components waiting for an an¬ 
swer from the restarted component receive a RetryAfter(n) 
exception, indicating that the in-flight requests can be re¬ 
submitted after n msec (the estimated time to recover); this 
exception is purely an optimization because, in its absence, 
components would have timeouts as a fallback mechanism. 
If the request is idempotent and its time-to-live allows it 
to be resubmitted, then the requesting component does so. 
Otherwise, a failure exception is propagated up the request 
chain until either a previous component decides to resub¬ 
mit, or the client needs to be notified of the failure. The web 
front-end issues an HTTP/1.1 Retry-After directive to 
the client with an estimate of the time to recover, and retry- 
capable clients can resubmit the original HTTP request. 
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Figure 1. A simple restart/retry architecture. 

In Figure 1 we show a simple restart/retry example, in 
which a request to view a shopping cart splits inside the 
application server into one subrequest to a stateful session 
EJB (Enterprise JavaBean) that communicates with a ses¬ 
sion state store and another subrequest to a stateless session 
EJB. Should the state store become unavailable, the appli¬ 
cation server either receives a Retry After exception or times 
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out, at which time it can decide whether to resubmit the re¬ 
quest to SSM or not. Within each of the subsystems shown 
in Figure 1, we can imagine each subrequest further split¬ 
ting into finer grain subrequests submitted to the respec¬ 
tive subsystems’ components. We have implemented crash¬ 
restarting of EJBs in a J2EE application server; an EJB-level 
micro-reboot takes less than a second [5]. 

Timeout-based failure detection is supplemented with 
traditional heartbeats and progress counters. The 
counters—compact representations of a component’s pro¬ 
cessing progress—are usually placed at state stores and 
in messaging facilities, where they can map state access 
and messaging activity into per-component progress. Many 
existing performance monitors can be transformed into 
progress monitors by augmenting them with request origin 
information. Components themselves can also implement 
progress counters that more accurately reflect application 
semantics, but they are less trustworthy, because they are 
inside the components. 

The dynamics of loosely coupled systems can sometimes 
be surprising. For example, resubmitting requests to a com¬ 
ponent that is recovering can overload it and make it fail 
again; for this reason, the RetryAfter exceptions provide 
an estimated time-to-recover. This estimated value can be 
used to spread out request resubmissions, by varying the re¬ 
ported time-to-recover estimate across different requestors. 
A maximum limit on the number of retries is specified in the 
application-global policy, along with the lease durations and 
communication timeouts. These numbers can be dynami¬ 
cally estimated based on historical information collected by 
a recovery manager [5], or simply captured in a static de¬ 
scription of each component, similar to deployment descrip¬ 
tors for EJBs. In the absence of such hints, a simple load 
balancing algorithm or exponential backoff can be used. 

To prevent reboot cycles and other unstable conditions 
during recovery, it is possible to quiesce the system when 
a set of components is being crash-rebooted. This can be 
done at the communication/RPC layer, or for the system 
as a whole. In our prototype, we use a stall proxy [5] in 
front of the web tier to keep new requests from entering the 
system during the recovery process. Since Internet work¬ 
loads are typically made of short running requests, the stall 
proxy transforms brief system unavailability into temporar¬ 
ily higher latency for clients. We are exploring modifica¬ 
tions to the Java RMI layer that would allow finer grain re¬ 
quest stalling. 

5. Discussion 

Building crash-only systems is not easy; the key to 
widespread adoption of our approach will require employ¬ 
ing the right architectural models and having the right tools. 
With the recent success of component-based architectures 
(e.g., J2EE and .Net), and the emergence of the application 
server as an operating system for Internet applications, it is 


possible to provide many of the crash-only properties in the 
platform itself. This would allow all applications running 
on that platform to take advantage of the effort and become 
crash-only. 

We are applying the principles described here to an open- 
source Java 2 Enterprise Edition (J2EE) application server. 
We are separating the individual J2EE services (naming, di¬ 
rectory lookup, messaging, etc.) into well-isolated com¬ 
ponents, implementing requests as self-describing contin¬ 
uations, modifying the RMI layer to allow for timeout- 
based operation, modifying the EJB containers to imple¬ 
ment lease-based resource allocation, and integrating non¬ 
transactional state stores like DeStor and SSM. A first step 
in this direction is described in [5]. 

We are focusing initially on applications whose work¬ 
loads can be characterized as relatively short-running tasks 
that frame state updates. Substantially all Internet services 
fit this description, in part because the nature of HTTP has 
forced designers into this mold. As enterprise services and 
applications (e.g., workflow, customer management) be¬ 
come web-enabled, they adopt similar architectures. We ex¬ 
pect there are many applications outside this domain that 
could not easily be cast this way, and for which deriving a 
crash-only design would be impractical or infeasible. 

In order for the restart/retry architecture to be highly 
available and correct, most requests it serves must be idem- 
potent. This requirement might be inappropriate for some 
applications. Our proposal does not handle Byzantine fail¬ 
ures or data errors, but such behavior can be turned into fail- 
stop behavior using well-known orthogonal mechanisms, 
such as triple modular redundancy [13] or clever state repli¬ 
cation [6]. 

In today’s Internet systems, fast recovery is obtained by 
overprovisioning and counting on rapid failure detection to 
trigger failover. Such failover can sometimes successfully 
mask hours-long recovery times, but often detecting failures 
end-to-end takes longer than expected. Crash-only software 
is complementary to this approach and can help alleviate 
some of the complex and expensive management require¬ 
ments for highly redundant hardware, because faster recov¬ 
ering software means less redundancy is required. In addi¬ 
tion, a crash-only system can reintegrate recovered compo¬ 
nents faster, as well as better accommodate removed, added, 
or upgraded components. 

We expect throughput to suffer in crash-only systems, but 
this concern is secondary to the high availability and pre¬ 
dictability we expect in exchange. The first program written 
in a high-level language was certainly slower than its hand- 
coded assembly counterpart, yet it set the stage for software 
of a scale, functionality and robustness that had previously 
been unthinkable. These benefits drove compiler writers to 
significantly optimize the performance of programs written 
in high-level languages, making it hard to imagine today 
how we could program otherwise. We expect the benefits of 
crash-only software to similarly drive efforts that will erase, 
over time, the potential performance loss of such designs. 
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6. Conclusion 

By using a crash-only approach to building software, we 
expect to obtain better reliability and higher availability in 
Internet systems. Application fault models can be simplified 
through the application of externally-enforced “crash-only 
laws,” thus encouraging simpler recovery routines which 
have higher chances of being correct. Writing crash-only 
components may be harder, but their simple failure behav¬ 
ior can make the assembly of such components into large 
systems easier. 

The promise of a simple fault model makes stating in¬ 
variants on failure behavior possible. A system whose 
component-level and system-level invariants can be en¬ 
forced through crash-rebooting is more predictable, making 
recovery management more robust. It is our belief that ap¬ 
plications and services with high availability requirements 
can significantly benefit from these properties. 

Once we surround a crash-only system with a suitable 
recovery infrastructure, we obtain a recursively restartable 
system [4]. Transparent recovery based on component- 
level micro-reboots enables restart/retry architectures to 
hide intra-system failure from the end users, thus improving 
the perceived reliability of the service. We find it encourag¬ 
ing that our initial prototype [5] was able to complete 78% 
more client requests under faultload than a non-crash-only 
version of the system that did not employ micro-reboots for 
recovery. 
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